Avoiding Overfitting, Pt. 2
Penalized Regression
Behavioral Data Science in R II
Unit 2
Module 6
Techniques that avoid overfitting by accepting increased model bias in exchange for decreased variance (the bias–variance trade-off).
Mean Squared Error Loss Function: \[ \frac{1}{n}\sum_{i=1}^{n}(y_i - f(x_i))^2 \]
L1 Regularization (Lasso): \[ \frac{1}{n}\sum_{i=1}^{n}(y_i - f(x_i))^2 + \lambda \sum_{j} |\beta_j| \]
L2 Regularization (Ridge): \[ \frac{1}{n}\sum_{i=1}^{n}(y_i - f(x_i))^2 + \lambda \sum_{j} \beta_j^2 \]
Here \(\lambda \ge 0\) controls penalty strength: \(\lambda = 0\) recovers ordinary MSE, and larger \(\lambda\) shrinks the coefficients \(\beta_j\) more aggressively.
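The three loss functions can be computed directly in base R. A minimal sketch on toy values (all object names and numbers here are made up for illustration):

```r
# Toy observed outcomes, predictions f(x_i), and fitted coefficients
y      <- c(3.0, 1.5, 4.0, 2.5)   # y_i
f_x    <- c(2.8, 1.7, 3.5, 2.9)   # f(x_i)
beta   <- c(0.9, -0.4, 0.0)       # beta_j
lambda <- 0.1                     # penalty strength (hypothetical)

mse   <- mean((y - f_x)^2)               # plain MSE loss
lasso <- mse + lambda * sum(abs(beta))   # L1 adds lambda * sum |beta_j|
ridge <- mse + lambda * sum(beta^2)      # L2 adds lambda * sum beta_j^2
```

Note that the L1 term still charges for the coefficient at 0.9 but nothing for the one at 0.0, which is why lasso tends to zero out weak predictors entirely.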
# Lasso (mixture = 1) multinomial regression; penalty will be tuned
mnr_spec_tune <- multinom_reg(
  mode    = "classification",
  engine  = "glmnet",
  penalty = tune(),
  mixture = 1
)

# 5-fold cross-validation and a grid of 50 candidate penalty values
folds      <- vfold_cv(train, v = 5)
param_grid <- grid_regular(penalty(), levels = 50)

# Bundle the preprocessing recipe and tunable model into a workflow
tune_wf <- workflow() %>%
  add_recipe(rec) %>%
  add_model(mnr_spec_tune)

# Fit the workflow across the grid, scoring each penalty by accuracy
tune_rs <- tune_grid(
  tune_wf,
  folds,
  grid    = param_grid,
  metrics = metric_set(accuracy)
)
Train accuracy:
[1] 0.8678161
Test accuracy:
[1] 0.7643678
[Figure panels: No Penalty · L1 (Lasso) · L2 (Ridge)]